Geometric Scene Parsing with Hierarchical LSTM

Authors

Zhanglin Peng

Ruimao Zhang

Xiaodan Liang

Xiaobai Liu

Liang Lin

Abstract

This paper addresses the problem of geometric scene parsing, i.e., simultaneously labeling geometric surfaces (e.g., sky, ground and vertical plane) and determining the interaction relations (e.g., layering, supporting, siding and affinity) between main regions. This problem is more challenging than traditional semantic scene labeling, as recovering geometric structures requires rich and diverse contextual information. To achieve these goals, we propose a novel recurrent neural network model, named Hierarchical Long Short-Term Memory (H-LSTM). It contains two coupled sub-networks: the Pixel LSTM (P-LSTM) and the Multi-scale Super-pixel LSTM (MS-LSTM), which handle surface labeling and relation prediction, respectively. The two sub-networks provide complementary information to each other to exploit hierarchical scene contexts, and they are jointly optimized to boost performance. Our extensive experiments show that our model is capable of parsing scene geometric structures and outperforms several state-of-the-art methods by large margins. In addition, we show promising 3D reconstruction results from still images based on the geometric parsing.

Similar resources

Improved Graph-Based Dependency Parsing via Hierarchical LSTM Networks

In this paper, we propose a neural graph-based dependency parsing model which utilizes hierarchical LSTM networks on character level and word level to learn word representations, allowing our model to avoid the limited-vocabulary problem and capture both distributional and compositional semantic information. Our model achieves state-of-the-art accuracy on Chinese Penn Treebank and competitive...

Single-Image 3D Scene Parsing Using Geometric Commonsense

This paper presents a unified grammatical framework capable of reconstructing a variety of scene types (e.g., urban, campus, country etc.) from a single input image. The key idea of our approach is to study a novel commonsense reasoning framework that mainly exploits two types of prior knowledge: (i) prior distributions over a single dimension of objects, e.g., that the length of a sedan is abo...

Hierarchical Feature For Scene Parsing Using Fully Recurrent Network

In scene parsing, wide-range contextual information is not effectively encoded. Scene parsing segments a scene into different regions associated with semantic categories. Its main objective is to reduce the semantic gap between humans and machines in scene understanding. Applications of scene parsing include object detection, text detection o...

Learning Dynamic Hierarchical Models for Anytime Scene Labeling

With increasing demand for efficient image and video analysis, test-time cost of scene parsing becomes critical for many large-scale or time-sensitive vision applications. We propose a dynamic hierarchical model for anytime scene labeling that allows us to achieve flexible trade-offs between efficiency and accuracy in pixel-level prediction. In particular, our approach incorporates the cost of ...

Long Short-term Memory Network over Rhetorical Structure Theory for Sentence-level Sentiment Analysis

Using deep learning models for sentence-level sentiment analysis is still a challenging task. The long short-term memory (LSTM) network addresses the vanishing-gradient problem of recurrent neural networks (RNNs), but its linear chain structure cannot capture text structure information. Subsequently, Tree-LSTM was proposed, which uses the LSTM forget gate to skip sub-trees that h...

Publication date: 2016
